Last Updated: December 19, 2025
A Version Control System (VCS) is a tool that helps individuals and teams manage changes to source code or files over time. It records a history of edits, enables collaborative development, and allows users to revert to previous versions if needed.
Key features of a typical VCS include commit history, branching, and the ability to revert to an earlier version.
Popular version control tools include Git, Subversion (SVN), and Mercurial.
In this chapter, we will explore the low-level design of a simplified version control system.
Let's start by clarifying the requirements:
Before starting the design, it's important to ask thoughtful questions to uncover hidden assumptions, clarify ambiguities, and define the system's scope more precisely.
Here is an example of how a conversation between the candidate and the interviewer might unfold:
Candidate: Should the system support nested directories and a hierarchical file structure?
Interviewer: Yes, the system should support managing files in a directory tree structure, similar to a typical file system.
Candidate: Should we store full snapshots of files or just the differences (deltas) between versions?
Interviewer: To keep things simple, store full snapshots for each commit. In real-world systems, we would optimize this using deltas.
Candidate: Do we need to support a staging area where users can select specific files to include in a commit?
Interviewer: No, assume that the entire repository is committed at once.
Candidate: Do we need to support branching and merging features?
Interviewer: Yes, basic branching should be supported. You can skip merging for now.
Candidate: Should the system allow users to roll back to a previous version?
Interviewer: Yes, the system should allow viewing commit history and reverting the repository to any prior commit.
Candidate: Should the system support viewing diffs between any two commits?
Interviewer: It would be nice to have, but let’s leave that out for this design.
Candidate: Should we use a command-line interface or just hardcode a sequence of operations for demonstration?
Interviewer: A hardcoded sequence is fine for demonstration purposes.
After gathering the details, we can summarize the key system requirements.
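- Support files and nested directories in a hierarchical tree.
- Store a full snapshot of the repository for each commit.
- Commit the entire repository at once, with no staging area.
- Support basic branching; merging is out of scope.
- Allow viewing commit history and reverting the repository to any prior commit.
- Expose core operations such as commit, checkout, branch, and revert.
- Demonstrate the system with a hardcoded sequence of operations rather than a CLI.

Core entities are the fundamental building blocks of our system. We identify them by analyzing the functional requirements and highlighting the key nouns and responsibilities that naturally map to object-oriented abstractions such as classes, enums, or interfaces.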
Let’s walk through the functional requirements and extract the relevant entities:
A version control system manages codebases, which are nothing more than directories containing files and subdirectories. To represent this structure:
- FileSystemNode acts as the base class for anything that can exist in the repository, either a File or a Directory.
- The File class represents a single file with its name and contents.
- The Directory class can contain multiple FileSystemNode children, enabling a tree-like structure to model folders and subfolders.

Each time a user commits, the system needs to save the entire state of the file system. This is where the Commit class comes in.
- A Commit object stores a full snapshot of the repository at a specific point in time.
- It holds the commit metadata (author, message, timestamp), a reference to its parent commit, and the root Directory of that snapshot.

To manage and retrieve these commits efficiently:
- CommitManager acts as a registry for all commits. It handles commit creation and lookup operations.

Branches allow developers to work in isolated timelines. Each branch points to its own chain of commits.
- The Branch class represents a line of development. It keeps track of its name and points to the latest commit (HEAD).
- BranchManager is responsible for creating new branches, switching between them, and maintaining all the branches in the system.

This setup ensures that multiple versions of the project can be developed independently.
To tie everything together, we need a top-level controller that understands the current state and user operations:
The VersionControlSystem class plays this role. It manages the active branch, handles operations like commit, checkout, and revert, and interfaces with both the CommitManager and BranchManager. This class is the main entry point for any core version control operation.
For testing and demonstration purposes:
The VersionControlSystemDemo class runs a predefined sequence of operations (like creating files, committing, switching branches) to showcase how the system behaves. This helps in validating the logic without building a full-fledged CLI.
These core entities define the key abstractions of a version control system and will guide the structure of your low-level design and class diagrams.
In this section, we outline the core classes involved in the design of a lightweight, in-memory version control system.
- Commit: Represents a snapshot of the entire file system at a given point in time.
- CommitManager: Responsible for creating and storing Commit objects.
- Branch: Represents a named pointer to a chain of commits.
- BranchManager: Manages multiple branches using a dictionary that maps each branch name to its Branch.
- VersionControlSystem: The central controller class that exposes the public API of the version control system.
Composite Pattern
Problem it Solves: The system needs to manage a file system, which is a tree-like structure containing both individual files (leaves) and directories (containers). We need a way to treat individual objects and compositions of objects uniformly.
How it's Applied: Directory and File both inherit from the common FileSystemNode base, and a Directory holds children of type FileSystemNode.
Benefits: Client code can treat individual files and folders uniformly and walk the tree recursively.
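To make the uniform treatment concrete, here is a minimal sketch of a recursive operation over a composite tree. The class names and the count_files operation are hypothetical stand-ins chosen for illustration; they are not part of the chapter's design.

```python
from abc import ABC, abstractmethod
from typing import Dict


class Node(ABC):
    """Common base for leaves and composites (plays the role of FileSystemNode)."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def count_files(self) -> int:
        """Hypothetical operation used only for this illustration."""


class FileLeaf(Node):
    def count_files(self) -> int:
        return 1  # a file counts as itself


class DirComposite(Node):
    def __init__(self, name: str):
        super().__init__(name)
        self.children: Dict[str, Node] = {}

    def add_child(self, node: Node) -> None:
        self.children[node.name] = node

    def count_files(self) -> int:
        # The directory never inspects child types; it simply delegates.
        return sum(child.count_files() for child in self.children.values())


root = DirComposite("root")
src = DirComposite("src")
root.add_child(FileLeaf("README.md"))
root.add_child(src)
src.add_child(FileLeaf("Main.java"))
src.add_child(DirComposite("empty"))

file_count = root.count_files()
print(file_count)
```

The caller asks the root for a count and never needs to know whether a child is a file or a directory.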
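Prototype Pattern
Problem it Solves: The requirement is to store "full snapshots" of the file system for each commit. This means we need to create an independent, deep copy of the entire working directory tree every time a commit is made. Manually iterating and creating new objects at every call site would be complex and error-prone.
How it's Applied: File and Directory each implement clone(). A Directory clones itself by recursively cloning its children, producing a deep copy of the file system snapshot.
Benefits: Each commit gets a snapshot that is independent of the working directory, and the copying logic lives in one place per node type instead of being repeated by callers.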
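The following sketch illustrates why the snapshot has to be a deep copy. It uses trimmed-down File and Directory classes (assumptions for this example, simplified from the chapter's versions) and contrasts clone() with a shallow copy made by the standard library's copy.copy.

```python
import copy
from typing import Dict, Union


class File:
    def __init__(self, name: str, content: str):
        self.name = name
        self.content = content

    def clone(self) -> "File":
        return File(self.name, self.content)


class Directory:
    def __init__(self, name: str):
        self.name = name
        self.children: Dict[str, Union[File, "Directory"]] = {}

    def add_child(self, node: Union[File, "Directory"]) -> None:
        self.children[node.name] = node

    def clone(self) -> "Directory":
        new_dir = Directory(self.name)
        for child in self.children.values():
            new_dir.add_child(child.clone())  # recurse into every child
        return new_dir


working = Directory("root")
working.add_child(File("README.md", "v1"))

snapshot = working.clone()    # deep copy via clone()
shallow = copy.copy(working)  # shallow copy, shown only for contrast

# Edit the working tree after both copies were taken.
working.children["README.md"].content = "v2"
working.add_child(File("notes.txt", "draft"))

snapshot_content = snapshot.children["README.md"].content  # unaffected
shallow_content = shallow.children["README.md"].content    # follows the edit
```

The shallow copy shares the same children dictionary as the working tree, so later edits leak into it. The cloned snapshot does not change, which is the property a commit needs.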
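Facade Pattern
Problem it Solves: The internal workings of the VCS are complex. There are Commit objects, Branch pointers, a map of all historical commits, and the working directory. A user shouldn't have to interact with all these components directly. They need a simple, high-level API.
How it's Applied: The VersionControlSystem class exposes operations such as commit, checkout, and revert, and delegates the work to the CommitManager and BranchManager.
Benefits: Client code calls a small set of methods without knowing how commits, branches, and snapshots are stored.
Memento Pattern
Where used: Commit holds a snapshot (the memento) of the entire Directory tree.

An abstract base class representing a node in the file system.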
    from abc import ABC, abstractmethod


    class FileSystemNode(ABC):
        def __init__(self, name: str):
            self.name = name

        def get_name(self) -> str:
            return self.name

        @abstractmethod
        def clone(self) -> 'FileSystemNode':
            """Return a deep copy of this node."""
            pass

        @abstractmethod
        def print(self, indent: str):
            """Print this node, prefixed by the given indentation."""
            pass

It defines a common interface for files and directories, including methods for cloning (Prototype Pattern) and printing the structure.
A concrete subclass of FileSystemNode representing a file.
    class File(FileSystemNode):
        def __init__(self, name: str, content: str):
            super().__init__(name)
            self.content = content

        def get_content(self) -> str:
            return self.content

        def set_content(self, content: str):
            self.content = content

        def clone(self) -> 'FileSystemNode':
            # Strings are immutable, so copying the reference is a safe deep copy.
            return File(self.name, self.content)

        def print(self, indent: str):
            print(f"{indent}- {self.name} (File)")

It stores file content and supports deep cloning of file state for snapshotting during commits.
A concrete subclass of FileSystemNode representing a directory.
    from typing import Dict, Optional


    class Directory(FileSystemNode):
        def __init__(self, name: str):
            super().__init__(name)
            # Children are keyed by name, so adding a node with an existing name replaces it.
            self.children: Dict[str, FileSystemNode] = {}

        def add_child(self, node: FileSystemNode):
            self.children[node.get_name()] = node

        def get_child(self, name: str) -> Optional[FileSystemNode]:
            return self.children.get(name)

        def get_children(self) -> Dict[str, FileSystemNode]:
            return self.children

        def clone(self) -> 'FileSystemNode':
            # Recursively clone every child to produce an independent subtree.
            new_dir = Directory(self.name)
            for child in self.children.values():
                new_dir.add_child(child.clone())
            return new_dir

        def print(self, indent: str):
            print(f"{indent}+ {self.name} (Directory)")
            for child in self.children.values():
                child.print(indent + " ")

It can contain other FileSystemNode instances (files or subdirectories), supports recursive cloning, and enables hierarchical structure.
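Because a directory stores its children by name, locating a nested node amounts to following names one level at a time. The sketch below assumes a hypothetical resolve_path helper that is not part of the chapter's design, and uses trimmed-down stand-ins for the node classes.

```python
from typing import Dict, Optional


class Node:
    def __init__(self, name: str):
        self.name = name


class File(Node):
    def __init__(self, name: str, content: str):
        super().__init__(name)
        self.content = content


class Directory(Node):
    def __init__(self, name: str):
        super().__init__(name)
        self.children: Dict[str, Node] = {}

    def add_child(self, node: Node) -> None:
        self.children[node.name] = node

    def get_child(self, name: str) -> Optional[Node]:
        return self.children.get(name)


def resolve_path(root: Directory, path: str) -> Optional[Node]:
    """Hypothetical helper: follow a '/'-separated path starting at root."""
    current: Node = root
    for part in (p for p in path.split("/") if p):
        if not isinstance(current, Directory):
            return None  # cannot descend into a file
        child = current.get_child(part)
        if child is None:
            return None  # no child with that name
        current = child
    return current


root = Directory("root")
src = Directory("src")
root.add_child(File("README.md", "This is a simple VCS."))
root.add_child(src)
src.add_child(File("Main.java", "public class Main {}"))

found = resolve_path(root, "src/Main.java")
missing = resolve_path(root, "src/Missing.java")
through_file = resolve_path(root, "README.md/anything")
```

Each lookup step is a dictionary access, so the cost grows with the depth of the path rather than the size of the tree.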
Represents a single commit in the version control system.
    import uuid
    from datetime import datetime
    from typing import Optional


    class Commit:
        def __init__(self, author: str, message: str, parent: Optional['Commit'], root_snapshot: Directory):
            # Short random ID: the first 8 characters of a UUID4 string.
            self.id = str(uuid.uuid4())[:8]
            self.author = author
            self.message = message
            self.parent = parent
            self.root_snapshot = root_snapshot
            self.timestamp = datetime.now()

        def get_id(self) -> str:
            return self.id

        def get_message(self) -> str:
            return self.message

        def get_author(self) -> str:
            return self.author

        def get_timestamp(self) -> datetime:
            return self.timestamp

        def get_parent(self) -> Optional['Commit']:
            return self.parent

        def get_root_snapshot(self) -> Directory:
            return self.root_snapshot

It captures a snapshot of the file system (Directory), the commit metadata (author, message, timestamp), and a reference to its parent commit, forming a chain of history.
Handles the creation and retrieval of commits.
    from typing import Dict, Optional


    class CommitManager:
        def __init__(self):
            # Registry of every commit ever created, keyed by commit ID.
            self.commits: Dict[str, Commit] = {}

        def create_commit(self, author: str, message: str, parent: Optional[Commit], root_snapshot: Directory) -> Commit:
            new_commit = Commit(author, message, parent, root_snapshot)
            self.commits[new_commit.get_id()] = new_commit
            return new_commit

        def get_commit(self, commit_id: str) -> Optional[Commit]:
            return self.commits.get(commit_id)

        def print_history(self, head_commit: Optional[Commit]):
            if head_commit is None:
                print("No commits in history.")
                return

            # Follow parent links from the head back to the first commit.
            current = head_commit
            while current is not None:
                print(f"Commit: {current.get_id()}")
                print(f"Author: {current.get_author()}")
                print(f"Date: {current.get_timestamp()}")
                print(f"Message: {current.get_message()}")
                print("--------------------")
                current = current.get_parent()

It maintains a map of all commits by ID and provides functionality to print the commit history starting from a specific commit.
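The history walk follows parent links from a head commit until it reaches a commit with no parent. As a sketch of the same traversal returning data instead of printing (which is easier to assert on in tests), the hypothetical collect_history helper below uses a reduced Commit with fixed, made-up IDs:

```python
from typing import List, Optional


class Commit:
    """Reduced stand-in: only the fields needed to walk the chain."""

    def __init__(self, commit_id: str, message: str, parent: Optional["Commit"]):
        self.id = commit_id
        self.message = message
        self.parent = parent


def collect_history(head: Optional[Commit]) -> List[str]:
    """Hypothetical helper: commit IDs from head back to the first commit."""
    ids: List[str] = []
    current = head
    while current is not None:
        ids.append(current.id)
        current = current.parent
    return ids


# Illustrative IDs; the chapter's Commit generates random ones instead.
c1 = Commit("a1b2c3d4", "Initial commit", None)
c2 = Commit("e4f5a6b7", "Add project structure", c1)
c3 = Commit("0c9d8e7f", "Update README documentation", c2)

history = collect_history(c3)
empty_history = collect_history(None)
```

The list is ordered newest first, matching the order in which the printed log appears.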
Represents a branch in the version control system.
    class Branch:
        def __init__(self, name: str, head: Commit):
            self.name = name
            self.head = head

        def get_name(self) -> str:
            return self.name

        def get_head(self) -> Commit:
            return self.head

        def set_head(self, head: Commit):
            self.head = head

Each branch has a name and a reference to its latest commit (the head).
Manages all branches in the system.
    from typing import Dict


    class BranchManager:
        def __init__(self, initial_commit: Commit):
            self.branches: Dict[str, Branch] = {}
            # The repository always starts on a "main" branch.
            main_branch = Branch("main", initial_commit)
            self.branches["main"] = main_branch
            self.current_branch = main_branch

        def create_branch(self, name: str, head: Commit):
            if name in self.branches:
                print(f"Error: Branch '{name}' already exists.")
                return
            new_branch = Branch(name, head)
            self.branches[name] = new_branch
            print(f"Created branch '{name}'.")

        def switch_branch(self, name: str) -> bool:
            if name not in self.branches:
                print(f"Error: Branch '{name}' not found.")
                return False
            self.current_branch = self.branches[name]
            print(f"Switched to branch '{name}'.")
            return True

        def update_head(self, new_head: Commit):
            self.current_branch.set_head(new_head)

        def get_current_branch(self) -> Branch:
            return self.current_branch

It supports creating new branches, switching between them, and updating the head commit of the current branch.
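Since a branch is only a name plus a reference to a head commit, creating one copies no history. The sketch below uses reduced Commit and Branch classes (assumptions for illustration) to show two branches sharing a commit until one of them advances.

```python
from typing import Optional


class Commit:
    def __init__(self, commit_id: str, parent: Optional["Commit"]):
        self.id = commit_id
        self.parent = parent


class Branch:
    def __init__(self, name: str, head: Commit):
        self.name = name
        self.head = head


c1 = Commit("c1", None)
main = Branch("main", c1)

# Creating a branch reuses the existing commit object; nothing is copied.
feature = Branch("feature/add-tests", main.head)
shared_at_creation = feature.head is main.head

# A commit on the feature branch advances only that branch's head.
c2 = Commit("c2", feature.head)
feature.head = c2

main_unchanged = main.head is c1
feature_builds_on_main = feature.head.parent is main.head
```

Both branches still share c1 as common history; they differ only in where their head points.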
The central singleton class that coordinates all components.
    class VersionControlSystem:
        _instance = None

        def __new__(cls):
            # Singleton: always hand back the same instance.
            if cls._instance is None:
                cls._instance = super().__new__(cls)
            return cls._instance

        def __init__(self):
            # __init__ runs on every VersionControlSystem() call, so guard against re-initialization.
            if hasattr(self, '_initialized'):
                return
            self._initialized = True
            self.commit_manager = CommitManager()
            self.working_directory = Directory("root")
            initial_commit = self.commit_manager.create_commit("system", "Initial commit", None, self.working_directory.clone())
            self.branch_manager = BranchManager(initial_commit)

        @classmethod
        def get_instance(cls):
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

        def get_working_directory(self) -> Directory:
            return self.working_directory

        def commit(self, author: str, message: str) -> str:
            parent_commit = self.branch_manager.get_current_branch().get_head()
            # Snapshot a deep copy so later edits do not alter the commit.
            snapshot = self.working_directory.clone()

            new_commit = self.commit_manager.create_commit(author, message, parent_commit, snapshot)
            self.branch_manager.update_head(new_commit)

            print(f"Committed {new_commit.get_id()} to branch {self.branch_manager.get_current_branch().get_name()}")
            return new_commit.get_id()

        def create_branch(self, name: str):
            head = self.branch_manager.get_current_branch().get_head()
            self.branch_manager.create_branch(name, head)

        def checkout_branch(self, name: str):
            success = self.branch_manager.switch_branch(name)
            if success:
                new_head = self.branch_manager.get_current_branch().get_head()
                # The working directory is replaced by a fresh clone of the snapshot.
                self.working_directory = new_head.get_root_snapshot().clone()

        def revert(self, commit_id: str):
            target_commit = self.commit_manager.get_commit(commit_id)
            if target_commit is None:
                print(f"Error: Commit '{commit_id}' not found.")
                return
            self.working_directory = target_commit.get_root_snapshot().clone()
            # Move the current branch head back to the target commit. Later commits
            # stay in the registry but no longer appear in this branch's log.
            self.branch_manager.update_head(target_commit)

            print(f"Repository state reverted to commit {commit_id}")

        def log(self):
            print(f"\n--- Commit History for branch '{self.branch_manager.get_current_branch().get_name()}' ---")
            head_commit = self.branch_manager.get_current_branch().get_head()
            self.commit_manager.print_history(head_commit)

        def print_current_state(self):
            print("\n--- Current Working Directory State ---")
            self.working_directory.print("")

It manages the working directory, delegates commit and branch operations, and supports key VCS operations like commit, revert, branch creation, checkout, and logging.
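The revert operation here moves the current branch's head back to the target commit, which behaves more like a hard reset than like Git's `git revert` (which records a new commit). The sketch below contrasts the two approaches under simplifying assumptions: a plain dictionary stands in for the Directory snapshot, and make_commit and history are hypothetical helpers.

```python
from typing import Dict, List, Optional


class Commit:
    def __init__(self, commit_id: str, parent: Optional["Commit"], snapshot: Dict[str, str]):
        self.id = commit_id
        self.parent = parent
        self.snapshot = snapshot  # stand-in for the root Directory snapshot


commits: Dict[str, Commit] = {}  # plays the role of the commit registry


def make_commit(commit_id: str, parent: Optional[Commit], snapshot: Dict[str, str]) -> Commit:
    commit = Commit(commit_id, parent, dict(snapshot))  # copy, like clone()
    commits[commit_id] = commit
    return commit


def history(head: Optional[Commit]) -> List[str]:
    ids: List[str] = []
    while head is not None:
        ids.append(head.id)
        head = head.parent
    return ids


c1 = make_commit("c1", None, {"README.md": "v1"})
c2 = make_commit("c2", c1, {"README.md": "v2"})

# Option A (the behavior in this chapter): point the head back at c1.
head_a = c1
reachable_a = history(head_a)
c2_still_stored = "c2" in commits

# Option B (an alternative, shown as an assumption): record the revert
# as a new commit that reuses c1's content on top of c2.
c3 = make_commit("c3", c2, c1.snapshot)
head_b = c3
reachable_b = history(head_b)
```

With option A the branch log shows only c1, although c2 remains in the registry and can still be looked up by ID. With option B the log keeps every commit and the revert itself becomes part of the history, at the cost of one more snapshot.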
A demonstration class that simulates interactions with the version control system.
    class VersionControlSystemDemo:
        @staticmethod
        def main():
            print("Initializing Version Control System...")
            vcs = VersionControlSystem.get_instance()

            # --- Initial State on 'main' branch ---
            vcs.print_current_state()

            # --- First Commit ---
            print("\n1. Making initial changes and committing...")
            root = vcs.get_working_directory()
            root.add_child(File("README.md", "This is a simple VCS."))
            src_dir = Directory("src")
            root.add_child(src_dir)
            src_dir.add_child(File("Main.java", "public class Main {}"))
            first_commit_id = vcs.commit("Alice", "Add README and initial source structure")
            vcs.print_current_state()

            # --- Second Commit ---
            print("\n2. Modifying a file and committing again...")
            readme = root.get_child("README.md")
            readme.set_content("This is an in-memory version control system.")
            second_commit_id = vcs.commit("Alice", "Update README documentation")
            vcs.print_current_state()

            # --- View History ---
            vcs.log()

            # --- Branching ---
            print("\n3. Creating a new branch 'feature/add-tests'...")
            vcs.create_branch("feature/add-tests")
            vcs.checkout_branch("feature/add-tests")

            print("\n4. Working on the new branch...")
            # Checkout replaces the working directory with a clone, so fetch it again;
            # the old 'root' reference no longer points at the live tree.
            root = vcs.get_working_directory()
            test_dir = Directory("tests")
            root.add_child(test_dir)
            test_dir.add_child(File("VCS_Test.java", "import org.junit.Test;"))
            feature_commit_id = vcs.commit("Bob", "Add test directory and initial test file")
            vcs.print_current_state()

            # --- View history on feature branch ---
            vcs.log()

            # --- Switch back to main ---
            print("\n5. Switching back to 'main' branch...")
            vcs.checkout_branch("main")
            # Notice the 'tests' directory is gone, as it only exists on the feature branch.
            vcs.print_current_state()
            vcs.log()  # Log shows only main branch history

            # --- Reverting ---
            print("\n6. Reverting 'main' branch to the first commit...")
            vcs.revert(first_commit_id)
            vcs.print_current_state()  # The README content is back to its original state.

            # --- View history after revert ---
            print("\nHistory of 'main' after reverting:")
            vcs.log()  # The head is now the first commit


    if __name__ == "__main__":
        VersionControlSystemDemo.main()

It walks through scenarios such as making commits, branching, switching branches, and reverting to previous commits.
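One pitfall worth noting when scripting such a sequence: checkout_branch and revert replace the working directory with a fresh clone, so a reference obtained earlier from get_working_directory() no longer points at the live tree. The sketch below reproduces the effect with a hypothetical TinyVCS stand-in that keeps a single snapshot; it is an illustration, not the chapter's class.

```python
from typing import Dict


class Directory:
    def __init__(self, name: str):
        self.name = name
        self.children: Dict[str, "Directory"] = {}

    def add_child(self, node: "Directory") -> None:
        self.children[node.name] = node

    def clone(self) -> "Directory":
        new_dir = Directory(self.name)
        for child in self.children.values():
            new_dir.add_child(child.clone())
        return new_dir


class TinyVCS:
    """Hypothetical stand-in: one stored snapshot, restored on checkout."""

    def __init__(self):
        self.working_directory = Directory("root")
        self.snapshot = self.working_directory.clone()

    def get_working_directory(self) -> Directory:
        return self.working_directory

    def checkout(self) -> None:
        # Same idea as checkout_branch: swap in a clone of the snapshot.
        self.working_directory = self.snapshot.clone()


vcs = TinyVCS()
root = vcs.get_working_directory()
vcs.checkout()

is_stale = root is not vcs.get_working_directory()
root.add_child(Directory("tests"))  # lands in the old, detached tree
lost_edit = "tests" not in vcs.get_working_directory().children

root = vcs.get_working_directory()  # fetch the live tree again
root.add_child(Directory("tests"))
kept_edit = "tests" in vcs.get_working_directory().children
```

Fetching the working directory again after every checkout or revert keeps edits on the tree that the next commit will actually snapshot.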